CLIC: Curriculum Learning and Imitation for object Control in non-rewarding environments
In this paper we study a new reinforcement learning setting where the
environment is non-rewarding, contains several possibly related objects of
various controllability, and where an apt agent Bob acts independently, with
non-observable intentions. We argue that this setting defines a realistic
scenario and we present a generic discrete-state discrete-action model of such
environments. To learn in this environment, we propose an unsupervised
reinforcement learning agent called CLIC for Curriculum Learning and Imitation
for Control. CLIC learns to control individual objects in its environment, and
imitates Bob's interactions with these objects. It selects objects to focus on
when training and imitating by maximizing its learning progress. We show that
CLIC is an effective baseline in our new setting: it can observe Bob to gain control of objects faster, even if Bob is not explicitly teaching. It can also follow Bob when he acts as a mentor and provides ordered demonstrations. Finally, when Bob controls objects that the agent cannot, or in the presence of a hierarchy between objects in the environment, we show that CLIC ignores non-reproducible and already mastered interactions with objects, resulting in a greater benefit from imitation.
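As a rough, hedged illustration of how learning-progress-based selection can work (a minimal sketch with an assumed windowed success-rate estimate per object, not the paper's actual algorithm):

```python
import random
from collections import deque

class ObjectCurriculum:
    """Toy curriculum: preferentially sample the object whose control success
    rate is changing the most, i.e. the one yielding the highest learning progress."""

    def __init__(self, object_ids, window=20, eps=0.05):
        self.histories = {o: deque(maxlen=2 * window) for o in object_ids}
        self.window = window
        self.eps = eps  # exploration floor so no object is starved of attempts

    def record(self, object_id, success):
        """Store the binary outcome of one control (or imitation) attempt."""
        self.histories[object_id].append(float(success))

    def learning_progress(self, object_id):
        """Absolute difference between recent and older success rates."""
        h = list(self.histories[object_id])
        if len(h) < 2 * self.window:
            return self.eps  # not enough data yet: treat as unexplored
        old, recent = h[:self.window], h[self.window:]
        return abs(sum(recent) - sum(old)) / self.window

    def sample_object(self):
        """Sample the next object to practice, proportionally to learning progress."""
        objs = list(self.histories)
        weights = [self.learning_progress(o) + self.eps for o in objs]
        return random.choices(objs, weights=weights, k=1)[0]
```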
CURIOUS: Intrinsically Motivated Modular Multi-Goal Reinforcement Learning
In open-ended environments, autonomous learning agents must set their own
goals and build their own curriculum through an intrinsically motivated
exploration. They may consider a large diversity of goals, aiming to discover
what is controllable in their environments, and what is not. Because some goals
might prove easy and some impossible, agents must actively select which goal to
practice at any moment, to maximize their overall mastery on the set of
learnable goals. This paper proposes CURIOUS, an algorithm that leverages 1) a
modular Universal Value Function Approximator with hindsight learning to
achieve a diversity of goals of different kinds within a single policy and 2)
an automated curriculum learning mechanism that biases the attention of the
agent towards goals maximizing the absolute learning progress. Agents focus
sequentially on goals of increasing complexity, and focus back on goals that
are being forgotten. Experiments conducted in a new modular-goal robotic
environment show the resulting developmental self-organization of a learning
curriculum, and demonstrate properties of robustness to distracting goals,
forgetting, and changes in body properties.
Comment: Accepted at ICML 2019.
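One ingredient named above, hindsight learning within goal modules, can be sketched as follows. The transition format, the "future"-style relabelling strategy, and the sparse 0/-1 reward convention are assumptions made for this illustration, not details taken from the paper:

```python
import random
import numpy as np

def goal_reached(achieved, desired, tol=0.05):
    """Toy success test: Euclidean distance below a tolerance."""
    return np.linalg.norm(np.asarray(achieved) - np.asarray(desired)) < tol

def relabel_with_hindsight(episode, module_id, p_replay=0.8):
    """episode: list of (obs, action, desired_goal, achieved_goal, next_obs).
    With probability p_replay, replace the desired goal of a transition with a
    goal actually achieved later in the same episode, while keeping the module
    identity the policy is conditioned on."""
    relabeled = []
    for t, (obs, action, desired, achieved, next_obs) in enumerate(episode):
        if random.random() < p_replay:
            _, _, _, future_achieved, _ = episode[random.randint(t, len(episode) - 1)]
            desired = future_achieved
        reward = 0.0 if goal_reached(achieved, desired) else -1.0  # sparse reward
        relabeled.append((obs, action, module_id, desired, reward, next_obs))
    return relabeled
```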
SLOT-V: Supervised Learning of Observer Models for Legible Robot Motion Planning in Manipulation
We present SLOT-V, a novel supervised learning framework that learns observer
models (human preferences) from robot motion trajectories in a legibility
context. Legibility measures how easily a (human) observer can infer the
robot's goal from a robot motion trajectory. When generating such trajectories,
existing planners often rely on an observer model that estimates the quality of
trajectory candidates. These observer models are frequently hand-crafted or,
occasionally, learned from demonstrations. Here, we propose to learn them in a
supervised manner using the same data format that is frequently used during the
evaluation of the aforementioned approaches. We then demonstrate the generality of
SLOT-V using a Franka Emika robot in a simulated manipulation environment. For this,
we show that it can learn to closely predict various hand-crafted observer
models, i.e., that SLOT-V's hypothesis space encompasses existing handcrafted
models. Next, we showcase SLOT-V's ability to generalize by showing that a
trained model continues to perform well in environments with unseen goal
configurations and/or goal counts. Finally, we benchmark SLOT-V's sample
efficiency (and performance) against an existing IRL approach and show that
SLOT-V learns better observer models with less data. Combined, these results
suggest that SLOT-V can learn viable observer models. Better observer models
imply more legible trajectories, which may, in turn, lead to better and more transparent human-robot interaction.
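As a hedged sketch of what a learned observer model might look like: a small PyTorch module that scores candidate goals from a partial trajectory. The GRU encoder, layer sizes, and the 7-dimensional state format are assumptions for illustration, not the paper's architecture:

```python
import torch
import torch.nn as nn

class ObserverModel(nn.Module):
    """Scores each candidate goal given an encoded (partial) robot trajectory;
    a softmax over the scores approximates the observer's inferred goal."""

    def __init__(self, state_dim=7, traj_dim=64, goal_dim=3, hidden=128):
        super().__init__()
        self.traj_encoder = nn.GRU(input_size=state_dim, hidden_size=traj_dim,
                                   batch_first=True)
        self.scorer = nn.Sequential(
            nn.Linear(traj_dim + goal_dim, hidden), nn.ReLU(), nn.Linear(hidden, 1)
        )

    def forward(self, traj, goals):
        """traj: (B, T, state_dim) robot states; goals: (B, G, goal_dim) candidates.
        Returns per-goal logits of shape (B, G)."""
        _, h = self.traj_encoder(traj)                     # h: (1, B, traj_dim)
        h = h.squeeze(0).unsqueeze(1).expand(-1, goals.size(1), -1)
        return self.scorer(torch.cat([h, goals], dim=-1)).squeeze(-1)

# Training would minimise cross-entropy between these logits and the goal that a
# human observer actually inferred from each (truncated) trajectory.
```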
Enhancing Agent Communication and Learning through Action and Language
We introduce a novel category of goal-conditioned (GC) agents capable of functioning as both
teachers and learners. Leveraging action-based demonstrations and
language-based instructions, these agents enhance communication efficiency. We
investigate the incorporation of pedagogy and pragmatism, essential elements in
human communication and goal achievement, enhancing the agents' teaching and
learning capabilities. Furthermore, we explore the impact of combining
communication modes (action and language) on learning outcomes, highlighting
the benefits of a multi-modal approach.
Comment: IMOL workshop, Paris 2023.
Automatic Context-Driven Inference of Engagement in HMI: A Survey
An integral part of seamless human-human communication is engagement, the
process by which two or more participants establish, maintain, and end their
perceived connection. Therefore, to develop successful human-centered
human-machine interaction applications, automatic engagement inference is one
of the tasks required to achieve engaging interactions between humans and
machines, and to make machines attuned to their users, hence enhancing user
satisfaction and technology acceptance. Several factors contribute to
engagement state inference, including the interaction context and
interactants' behaviours and identity. Indeed, engagement is a multi-faceted
and multi-modal construct that requires high accuracy in the analysis and
interpretation of contextual, verbal, and non-verbal cues. Thus, the development of an automated and intelligent system that accomplishes this task has proven challenging so far. This paper presents a comprehensive survey on
previous work in engagement inference for human-machine interaction, covering interdisciplinary definitions, engagement components and factors, publicly available datasets, ground-truth assessment, and the most commonly used features
and methods, serving as a guide for the development of future human-machine
interaction interfaces with reliable context-aware engagement inference
capability. An in-depth review across embodied and disembodied interaction modes, and an emphasis on the interaction context into which engagement perception modules are integrated, set the presented survey apart from existing surveys.
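To make the multi-modal framing concrete, a toy late-fusion classifier over contextual, verbal, and non-verbal cues might look as follows; every feature name, dimension, and the logistic-regression choice are invented for illustration and do not come from the surveyed work:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def fuse_features(context_vec, verbal_vec, nonverbal_vec):
    """Late fusion: concatenate contextual, verbal and non-verbal cues."""
    return np.concatenate([context_vec, verbal_vec, nonverbal_vec])

# Train a toy engaged / not-engaged classifier on fused (synthetic) cues.
rng = np.random.default_rng(0)
X = np.stack([fuse_features(rng.normal(size=4),     # e.g. task phase, embodiment
                            rng.normal(size=8),     # e.g. prosody, dialogue acts
                            rng.normal(size=16))    # e.g. gaze, posture, expressions
              for _ in range(200)])
y = rng.integers(0, 2, size=200)                    # ground-truth engagement labels
clf = LogisticRegression(max_iter=1000).fit(X, y)
```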
A reduced reference image quality metric based on feature fusion and neural networks
A Global Reduced Reference Image Quality Metric (IQM) based on feature fusion using neural networks is proposed. The main idea is the introduction of a Reduced Reference degradation-dependent IQM (RRIQM/D) across a set of common distortions. The first stage consists of extracting a set of features from the wavelet-based edge map. Such features are then used to identify the type of degradation using Linear Discriminant Analysis (LDA). The second stage consists of fusing the extracted features into a single measure using Artificial Neural Networks (ANN). The result is a degradation-dependent IQM called the RRIQM/D. The performance of the proposed method is evaluated using the TID2008 database and compared to some existing IQMs. The experimental results obtained using the proposed method demonstrate improved performance, even when compared to some Full Reference IQMs.
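A hedged sketch of the two-stage pipeline described above: LDA identifies the degradation type from reduced-reference features, then a degradation-dependent neural regressor fuses the same features into a quality score. The wavelet edge-map feature extraction is stubbed out, and all dimensions and training data are illustrative placeholders:

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.neural_network import MLPRegressor

def extract_rr_features(image):
    """Placeholder for the wavelet edge-map, reduced-reference feature extraction."""
    return np.random.rand(12)  # e.g. per-subband edge statistics

# Toy training data: features, degradation-type labels, and subjective scores (MOS).
X_train = np.random.rand(300, 12)
deg_labels = np.random.randint(0, 4, size=300)   # e.g. blur / JPEG / noise / ...
mos = np.random.rand(300)

# Stage 1: identify the degradation type with Linear Discriminant Analysis.
lda = LinearDiscriminantAnalysis().fit(X_train, deg_labels)

# Stage 2: one degradation-dependent neural regressor fuses features into a score.
regressors = {d: MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000)
                   .fit(X_train[deg_labels == d], mos[deg_labels == d])
              for d in np.unique(deg_labels)}

def rriqm_d(image):
    """Predict the degradation-dependent quality score of a distorted image."""
    feats = extract_rr_features(image).reshape(1, -1)
    degradation = int(lda.predict(feats)[0])
    return regressors[degradation].predict(feats)[0]
```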
A domain adaptive deep learning solution for scanpath prediction of paintings
Cultural heritage understanding and preservation is an important issue for
society as it represents a fundamental aspect of its identity. Paintings
represent a significant part of cultural heritage and are continuously studied. However, the way viewers perceive paintings is closely related to the behaviour of the so-called HVS (Human Visual System). This paper focuses on the eye-movement analysis of viewers during the visual experience of a certain number of paintings. More specifically, we introduce a new approach to
predicting human visual attention, which impacts several cognitive functions
for humans, including the fundamental understanding of a scene, and then extend
it to painting images. The proposed new architecture ingests images and returns
scanpaths, i.e., sequences of points with a high likelihood of catching
viewers' attention. We use an FCNN (Fully Convolutional Neural Network), in
which we exploit differentiable channel-wise selection and Soft-Argmax modules. We also incorporate learnable Gaussian distributions into the network
bottleneck to simulate visual attention process bias in natural scene images.
Furthermore, to reduce the effect of shifts between different domains (i.e., natural images vs. paintings), we encourage the model to learn general, domain-invariant features in an unsupervised way using a gradient reversal classifier. Our model outperforms existing state-of-the-art approaches in terms of accuracy and efficiency.
Comment: Accepted at CBMI 2022, Graz, Austria.
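A minimal sketch of the gradient-reversal mechanism used for the natural-image-to-painting adaptation; the scanpath network itself is omitted, and the feature dimension, classifier head, and lambda scaling are illustrative assumptions:

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity on the forward pass; sign-flipped, scaled gradient on the backward
    pass, pushing the shared features to become domain-invariant."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

class DomainClassifier(nn.Module):
    """Predicts the domain (natural image vs. painting) from reversed features."""

    def __init__(self, feat_dim=256, lambd=1.0):
        super().__init__()
        self.lambd = lambd
        self.head = nn.Sequential(nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 2))

    def forward(self, features):
        return self.head(GradReverse.apply(features, self.lambd))
```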
An Inter-observer consistent deep adversarial training for visual scanpath prediction
The visual scanpath is a sequence of points through which the human gaze
moves while exploring a scene. It is one of the fundamental concepts upon which visual attention research is based. As a result, the ability to predict scanpaths has emerged as an important task in recent years. In this paper, we
propose an inter-observer consistent adversarial training approach for scanpath
prediction through a lightweight deep neural network. The adversarial method
employs a discriminative neural network as a dynamic loss that is better suited
to model the naturally stochastic phenomenon while maintaining consistency between the distributions arising from the subjective nature of the scanpaths traversed by different observers. Through extensive testing, we show the competitiveness of our approach with regard to state-of-the-art methods.
Comment: ICIP202
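As a compact, hedged sketch of a discriminator acting as a dynamic loss for scanpath prediction: the image encoder is abstracted away, scanpaths are assumed to be fixed-length sequences of 2D fixation points, and the non-saturating GAN loss is one possible choice rather than necessarily the paper's:

```python
import torch
import torch.nn as nn

class ScanpathDiscriminator(nn.Module):
    """Scores how plausibly human a scanpath looks for a given image encoding."""

    def __init__(self, img_dim=128, T=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(img_dim + 2 * T, 128), nn.LeakyReLU(0.2), nn.Linear(128, 1)
        )

    def forward(self, img_feat, scanpath):
        """img_feat: (B, img_dim); scanpath: (B, T, 2) fixation points."""
        return self.net(torch.cat([img_feat, scanpath.flatten(1)], dim=1))

def generator_loss(disc, img_feat, predicted_scanpath):
    """Non-saturating GAN loss acting as a dynamic criterion: the predictor is
    rewarded for scanpaths the discriminator cannot tell apart from human ones,
    which tolerates inter-observer variability better than a fixed distance."""
    logits = disc(img_feat, predicted_scanpath)
    return nn.functional.binary_cross_entropy_with_logits(
        logits, torch.ones_like(logits))
```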